Introduction

The tables were created for two purposes: To this end, anyone not familiar with this topic is urged to read the following note with care, and preferably also to consult the associated briefing/tutorial.

The definitive statement of what a code point is supposed to represent is the description contained in column 5. The table only covers the code points from 160 (decimal) upwards, because

It cannot be repeated too often that the many people who have put all 256 possible code points up on their screens, and have assumed that what they see there is a definition of ISO8859-1, are very seriously misguided. Their web pages, no matter how well-intentioned, are highly confusing, and serve only to mislead others who have not yet understood the problem.
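As a rough illustration of why a screenful of glyphs defines nothing, the following Python sketch (not part of the original materials) decodes one and the same byte under three different encodings. The codec names are those shipped with Python's standard library, and the choice of byte 0x99 is simply an example of a code point below 160.

```python
import unicodedata

def describe(byte_value, encoding):
    """Decode one byte under the given encoding and name the result."""
    ch = bytes([byte_value]).decode(encoding)
    # C1 control characters have no Unicode name, so fall back to a label.
    return unicodedata.name(ch, "<control U+%04X>" % ord(ch))

# Byte 0x99 (153 decimal) lies below 160, outside the range the table covers.
for enc in ("latin-1", "cp1252", "mac_roman"):
    print(enc, "->", describe(0x99, enc))
```

The three results differ: ISO-8859-1 assigns no displayable character to that position, while the Windows and Macintosh code pages each put a different glyph there. What a given display happens to show is therefore a property of the platform, not of ISO8859-1.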

Authors are also advised to refer to my report on browsers, which shows the extent to which some popular browsers support the mechanisms described here, and offers advice to authors on how to code their HTML for best results.

Entity name variants and test cases

This section lists the variants, genuine or bogus, of the entity names known to me, for those cases where more than one entity name is known for a particular glyph. See the character code briefing document for discussion of this issue. The character code test table, linked from the next section of the present document, contains only one of these entity names per character, and now conforms to the names given in the ISOnum and ISOdia entity sets as well as to the "Proposed Entities" list in the HTML2.0 Specification (RFC1866).

The names uml and die are perfectly valid alternatives for the same glyph, intended to be used according to context. Sadly, some browsers implement only the one and some only the other, but as neither of them is of much use in HTML this could be regarded as a trivial problem. (Look them up in a good dictionary if you want to know the difference - or look up "dieresis" if it's a US dictionary.)

     Description                 entity name     test case
     -----------                 -----------     ---------
Umlaut mark or diaeresis            uml            ¨
                                    die            ¨
macron (overbar)                    macron         ¯on;
                                    macr           ¯
                                    hibar          &hibar;
degree                              degree         °ree;
                                    deg            °
cedilla                             Cedilla        ¸
                                    cedil          ¸

The following are not part of the ISO-8859-1 repertoire:

trade mark (TM)                     trade          ™
endash                              endash         &endash;
emdash                              emdash         &emdash;

"Non-white" space (shown in brackets for clarity):
                                    ensp           [ ]
                                    emsp           [ ]

some folks seem to think these are  enspace        [&enspace;]
                                    emspace        [&emspace;]
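
The names above can also be checked mechanically. The sketch below, which is not part of the original materials, uses the entity table in Python's standard library as the yardstick. Note the assumption: that table reflects the later HTML 4 entity sets rather than the HTML2.0 "Proposed Entities" list, so it shows only which names eventually became standard, not what any particular browser honours.

```python
from html.entities import name2codepoint

# Entity names listed in the section above (variants, genuine or bogus).
CANDIDATES = [
    "uml", "die", "macron", "macr", "hibar", "degree", "deg",
    "Cedilla", "cedil", "trade", "endash", "emdash",
    "ensp", "emsp", "enspace", "emspace",
]

def split_by_html4(names):
    """Return (defined, undefined) relative to the HTML 4 entity table."""
    defined = [n for n in names if n in name2codepoint]
    undefined = [n for n in names if n not in name2codepoint]
    return defined, undefined

defined, undefined = split_by_html4(CANDIDATES)
for name in defined:
    # Show the numeric reference each recognised name corresponds to.
    print("&%s;  ->  &#%d;" % (name, name2codepoint[name]))
print("not in the HTML 4 table:", ", ".join(undefined))
```

Where a name is missing from whatever table a browser uses, the numeric form remains the more portable way to write the character.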

The Table

The columns of the table are as follows.
1,2,3: Code value, in hex, decimal and octal respectively
4: Entity name - note that HTML2 browsers generally honour only a subset of the entities from this list, as should become clear from viewing col.8 of the table in a few different browsers.
5: Description of the associated character
6: The character itself, sent as an 8-bit character from the server
7: The &#number; representation of the code point
8: The &entity; representation as in col.4. If it is supported by your browser, this will produce the desired effect; if not, it seems most commonly to display as the &entity; sequence itself.
9: Comments
*M marks characters that are typically displayed wrongly by Macintosh-based browsers.

If the browser is behaving as desired, then columns 6 and 7, and (where the browser supports the entity name) column 8, should all display the glyph appropriate to the description in column 5.

If your browser supports at least the basic elements of the HTML3 <TABLE> construct, you can view the TABLE format test document; any browser should be able to view the pre-formatted test document.

Technical note: the tables were created by executing a REXX script.
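
The REXX script itself is not reproduced here. Purely as an illustration of the column layout described above, the following Python sketch builds columns 1 to 8 for each code point from 160 upwards. It rests on two assumptions: the entity names come from Python's HTML 4 table rather than from the list used for the real tables, and the descriptions are the Unicode character names rather than the author's own wording. The function name table_row is hypothetical.

```python
import unicodedata
from html.entities import codepoint2name

def table_row(cp):
    """Build columns 1-8 for one code point as a list of strings."""
    name = codepoint2name.get(cp)           # col.4, None if no entity name
    return [
        "%02X" % cp,                        # col.1: hex
        "%d" % cp,                          # col.2: decimal
        "%03o" % cp,                        # col.3: octal
        name or "-",                        # col.4: entity name
        unicodedata.name(chr(cp)).lower(),  # col.5: Unicode name (assumption)
        chr(cp),                            # col.6: the character itself
        "&#%d;" % cp,                       # col.7: numeric representation
        "&%s;" % name if name else "-",     # col.8: entity representation
    ]

for cp in range(160, 256):
    print("  ".join(table_row(cp)))
```

Column 6 only corresponds to "an 8-bit character from the server" if the output is written out encoded as ISO-8859-1; in the sketch it is simply a one-character string.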

Addendum: non-break space test

This section tests your browser for its behaviour on one or several non-break spaces. To avoid complications, no kind of indenting or formatting is attempted here; just the plain test materials, left aligned, and nothing more.

First, ordinary (i.e. not pre-formatted) text

For comparison, lines that use a single ordinary space are also shown:
|| no space at all
| | a single ordinary space
| | a single nbsp
|   | three nbsp
| | a numeric entity &#160;
|   | three of those
| | a single ordinary space again

Now the same thing inside PRE-formatted text

The layout is the same as before.
|| no space at all
| | a single ordinary space
| | a single nbsp
|   | three nbsp
| | a numeric entity &#160;
|   | three of those
| | a single ordinary space again
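
For readers who want to check what these test lines ought to contain, independently of any browser, here is a small Python sketch. It is not part of the original materials, and the markup strings in it are assumptions about how the test lines were coded, since the HTML source is not shown here. It resolves the references with the standard library's html.unescape and counts the resulting no-break spaces.

```python
import html

NBSP = "\u00a0"  # code point 160, NO-BREAK SPACE

def decode(markup):
    """Resolve character references the way Python's html module does."""
    return html.unescape(markup)

# Hypothetical markup for the test lines above.
samples = [
    "||",
    "| |",
    "|&nbsp;|",
    "|&nbsp;&nbsp;&nbsp;|",
    "|&#160;|",
    "|&#160;&#160;&#160;|",
]
for s in samples:
    text = decode(s)
    print(repr(s), "->", repr(text), "nbsp count:", text.count(NBSP))
```

The named and numeric forms decode to the same character, which is distinct from the ordinary space at code point 32; a browser that treats them differently from each other is misbehaving.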


Original materials © Copyright 1994, 1995, 1996 A.J.Flavell & Glasgow University